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Abstract 

Polynomial chaos expansions (PCE) allow us to propagate uncertainties in the 
coefficients of differential equations to the statistics of their solutions. Their main ad¬ 
vantage is that they replace stochastic equations by systems of deterministic equations. 
Their main challenge is that the computational cost becomes prohibitive when the di¬ 
mension of the parameters modeling the stochasticity is even moderately large. We 
propose a generalization of the PCE framework that allows us to keep this dimension 
as small as possible in favorable situations. For instance, in the setting of stochastic 
differential equations (SDEs) with Markov random forcing, we expect the future evolu¬ 
tion to depend on the present solution and the future stochastic variables. We present 
a restart procedure that precisely allows PCE to depend only on that information. The 
computational difficulty then becomes the construction of orthogonal polynomials for 
dynamically evolving measures. We present theoretical results of convergence for our 
Dynamical generalized Polynomial Chaos (DgPC) method. Numerical simulations for 
linear and nonlinear SDEs show that it adequately captures the long-time behavior of 
their solutions as well as their invariant measures when the latter exist. 

Keywords: Polynomial chaos, stochastic differential equations, uncertainty quantifica¬ 
tion, Markov processes 


1 Introduction 

Polynomial chaos (PC) is a computational spectral method that is used to propagate and 
quantify uncertainties arising in the modeling of physical systems. The method originated 
from the works of Wiener [33] and Cameron and Martin [6] on the decomposition of func¬ 
tionals in a basis of Hermite polynomials of Gaussian random variables. Ghanem and 
Spanos m used such an expansion to solve stochastic equations with random data. Xiu 
and Karniadakis m bh introduced generalized polynomial chaos (gPC) involving non- 
Gaussian random parameters. Extensions to arbitrary probability measures followed in 
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the works [29] [32]; see also mm for recent theoretical developments and a convergence 
result for gPC expansions. 

Polynomial chaos expansions (PCE) provide an explicit expression of quantities of 
interests as functionals of the underlying uncertain parameters and in some situations 
allow us to perform uncertainty quantifications at a considerably lower computational 
cost than Monte Carlo methods. However, they suffer from the curse of dimensionality 
and thus typically work efficiently for systems involving low dimensional uncertainties. A 
related major drawback appears in the long-term integration of evolution equations. In 
the presence of random forcing in time, the number of stochastic variables in the system 
increases linearly with time and thus quickly becomes overwhelming. Moreover, standard 
PCE utilize orthogonal polynomials of the initial distribution and as time evolves, the 
dynamics deviate from the initial data substantially (e.g., due to nonlinearities) and the 
solution may become poorly represented in the initial basis. This led the authors in |5] 
to legitimately question the usefulness of (standard) PCE to address long-term evolution 
properties of solutions such as intermittent instabilities. This paper aims to address the 
aforementioned drawbacks. 

We propose a PC-based method that constructs evolving chaos expansions based on 
polynomials of projections of the time dependent solution and the random forcing. More 
precisely, chaos expansions at each iteration are constructed based on the knowledge of the 
moments of underlying distributions at a given time. In the setting of dynamics satisfying 
a Markov property, we crucially exploit this feature to introduce projections of the solution 
at prescribed time steps that allow us to ’’forget” about the past and as a consequence keep 
the dimension of the random variables fixed and independent of time. Our chaos basis is 
adapted to the evolving dynamics with the following consequences: (i) our expansion retains 
its optimality for long times; and (ii) the curse of dimensionality is mitigated. Notably, 
we establish, with appropriate modifications, theoretical convergence analysis which sheds 
light on our numerical findings. Further, asymptotic analysis for computational complexity 
will also be discussed. Inspired by examples of [sum, we will apply our method to a 
nonlinear coupled system of stochastic differential equations (SDEs) and its variants. 

Other iterative methodologies have already been proposed in miSlElE] in various 
contexts of random differential equations (RDEs). The main novelty of our approach is that 
it allows us to compute long-term solutions of evolution equations with complex stochastic 
forcing, which here will be a Brownian motion but could be generalized to more complex 
models. White noise-driven evolution equations have important roles in modeling the small 
scale effects and certain uncertainties in many applications such as turbulence, filtering, 
mathematical finance, and stochastic control mmmm- 

The plan of the paper is as follows. Section [2] introduces background material and 
establishes the necessary notation for PCE. Our methodology, called Dynamical generalized 
Polynomial Chaos (DgPC), is described in detail in section [3l Numerical experiments 
comparing DgPC to Hermite PC and Monte Carlo simulations are presented in section 2J 
Some conclusions are offered in section [5j 


DgPC 


3 


2 Polynomial Chaos 

We briefly introduce the original PC framework. The notation we follow can be found 
in HU |22J EJ. 

Given a probability space (0, F, P) where O is a general sample space, F is a er-algebra 
of subsets and P is a probability measure, we consider L 2 (0, F, P), the space of real-valued 
random variables with finite second moments. Let £ = (£i,£ 2 , • • •) be a countable collec¬ 
tion of independent and identically distributed (i.i.d) standard Gaussian random variables 
belonging to the probability space, and F = <r(£). Then, we define the Wick polynomials 
T Q (£) := where a belongs to set of multi-indices with a finite number of 

nonzero components J = {a = (au, 02 , • • •) | otj € No, |a| = a* < 00 }, H n is the nth 
order normalized one-dimensional Hermite polynomial and No := N U {0}. Note that the 
Wick polynomials are orthogonal to each other with respect to the measure induced by £. 

The Cameron and Martin theorem j6j establishes that the Wick polynomials form a 
complete orthonormal basis in L 2 (Q, F, P). This means that any functional «(•,£) € L 2 , 
can be expanded as 


«(-,£) = E «a(0r a (O> “«(■) = EK-,£)T a (£)], (2.1) 

a£j 

and the sum converges in L 2 . We define E as expectation with respect to P. The expansion 
(12.111 is called polynomial chaos expansion (PCE). We will also use the term Hermite PCE 
to emphasize that the expansion utilizes Gaussian random variables. 

One of the notable features of m is that it separates the randomness in u such 
that the coefficients u a are deterministic and all statistical information is contained in the 
coefficients. In particular, the first two moments are given by E[u] = uo and E[u 2 ] = 
\ u a\ 2 - Higher order moments may then be computed using either triple products of 
Wick polynomials [351SH [2Tj or the Hermite product formula [Ms 22]; see section 13.2.11 
In numerical computations, the doubly infinite expansion (|2.1I1 is truncated so that it 
becomes a finite expansion 


K 

U « uk,n(-,€i, ■ ■ ■ ,€k) '■= y: Uq(-) TT ( 2 - 2 ) 

\a\<N »= 1 

where we simply used polynomials up to degree N in the variables (£ 1 , £2 ■ ■ ■, £k)- Through¬ 
out the paper, we use the graded lexicographic ordering for multi-indices; see m Table 
5.2], 

In the context of white noise-driven SDEs, the random variables £j can be obtained 
by the projection £j = f* m,i(s) dW(s), where W(s) is a Brownian motion on [0,f] and mi 
is a complete orthonormal system in L 2 [0,f]. Moreover, £ = (£j)j is comprised of i.i.d. 
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standard Gaussian random variables, and the expansion 


OO «£ 

V] & / rrii(T)dT 
i =1 Jo 


(2.3) 


converges in L 2 for all s < t and has the error bound 


E 


K 

i =1 




(2.4) 


provided the mj’s are trigonometric polynomials; see (14.21) and [22, T9j. Here, Brownian 
motion (W(s), 0 < s < t} is projected onto L 2 [0,f] for a fixed time t > 0 so that the 
corresponding Wick polynomials T Q (£) depend implicitly on t . 

The simple truncation (12.21) leads to { K ^ N ) terms in the approximation. Thus, the 
computational cost increases rapidly with high dimensionality, which in turn decreases the 
efficiency of PCE. This is the “curse of dimensionality”. Another related major problem 
of PC is that expansions may converge slowly and even fail to converge for long time 
evolutions mmmum- These problems led to numerous extensions of the Hermite 
PCE, which we now briefly discuss. 

The paper m proposed a method called generalized polynomial chaos (gPC), in which 
the above random variables £ have specific, non-Gaussian, distributions in the Askey fam¬ 
ily [37, 34], Further generalizations to arbitrary probability measures beyond the Askey 
family were proposed in [32], where the probability space is decomposed into multiple ele¬ 
ments and chaos expansions are employed in each subelement. The approach taken in m 
is to use a restart procedure, where the chaos expansion is restarted at different time-steps 
in order to mitigate the long-term integration issues of chaos expansions. Another general¬ 
ization toward arbitrary distributions is presented in [26], using only moment information 
of the involved distribution with a data-driven approach. 


3 Description of the proposed method 

Our key idea is to adapt the PCE to the dynamics of the system. Consider for concreteness 
the following SDE: 


du(s) = L(u) ds + a dW(s), u(0) = uq, s 6 [0,f], (3-1) 

where L(u ) is a general function of u (and possibly other deterministic or stochastic pa¬ 
rameters), W(s) is a Brownian motion, a > 0 is a constant, and uq is an initial condition 
with a prescribed probability density. We assume that a solution exists and is unique on 

[o,*]. 

As the system evolves, a fixed polynomial chaos basis adapted to uq and dW may 
not be optimal to represent the solution u(t) for long times. Moreover, the dimension of 
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the representation of dW increases linearly with time t, which renders the PCE method 
computationally intractable even for moderate values of t. We will therefore introduce an 
increasing sequence of restart times 0 < tj < tj+\ < t and construct a new PCE basis at 
each tj based on the solution u(tj) and all additional random variables that need to be 
accounted for. A very similar methodology with a = 0 was considered earlier in mm- 
When random forcing is present and satisfies the Markov property as in the above example, 
the restarting strategy allows us to ” forget” past random variables that are no longer 
necessary and focus on a significantly smaller subset of random variables that influence the 
future evolution. As an example of application, we will show that the restarting strategy 
allows us to capture the invariant measure of the above SDE, when such a measure exists. 

To simplify notation, we present our algorithm on the above scalar SDE with L{u ) a 
function of u, knowing that all results also apply to systems of SDEs with minor modifica¬ 
tions; see sections 13.2.41 and [4j 

3.1 Formulation 

First, we notice that the solution u(t) of (13.11) is a random process depending on the initial 
condition and the paths of Brownian motion up to time t: 

u = u(t; {W s , 0 < s < t},uo). 

Therefore, recalling the expansion for W s (12.31) . the solution at time t can be seen as a 
nonlinear functional of uq and the countably infinite variables As previously noticed 
in nan], the solution u(t) can be represented as a linear chaos expansion in terms of the 
polynomials in itself and therefore, for sufficiently small later times t+e, e > 0, the solution 
u(t + e) can be efficiently captured by low order chaos expansions in u(t ) everything else 
being constant. Moreover, the solution u(t + e) on the interval 0 < e < Sq depends on 
W\t,t+Eo] an d n °t on values of W outside of this interval. This significantly reduces the 
number of random variables in £ that need to be accounted for. This crucial observation 
clearly hinges upon the Markovian property of the dynamics. Hence, dividing the time 
horizon into small pieces and iteratively employing PCE offer a possible way to alleviate 
both curse of dimensionality and long-term integration problems. 

We decompose the time horizon [0, t] into n subintervals; namely [0, H], [fi, £ 2 ], • • •, [t n -i, t] 
where 0 = to < t\ < ... < t n = t. The idea is then to employ polynomial chaos expansion 
in each subinterval and restart the approximation depending on the distributions of both 
and u(tj) at each tj, 1 < j < n; see Figured] Here, denotes the Gaussian random 
variables required for Brownian forcing on the interval [tj, b/+i]. Throughout the paper, we 
utilize the term T a to represent orthonormal chaos basis involving its arguments. In order to 
establish evolution equations for chaos basis T a (£j), triple products E[T Q () Tp ()T y (£j)] 
are needed. This procedure basically corresponds to computing the coefficients and indices 
in the Hermite product formula (13.31) . 
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The probability distribution of u(t 3 ) does not belong to any classical family such as the 
Askey family in general. The construction of the orthogonal polynomials in u(tj) is there¬ 
fore fairly involved computationally and is based on the general PCE results mentioned 
earlier in the section. At each time step, we compute the moments of u(tj) using its pre¬ 
viously obtained chaos representation and incorporate them in a modified Gram-Schmidt 
method. This is a computationally expensive step. Armed with the orthogonal basis, we 
then compute the triple products in u{tj) to perform the necessary Galerkin projections 
onto spaces spanned by this orthogonal basis and thus obtain evolution equations for the 
deterministic expansion coefficients. Letting Uj := u(tj), equations for the coefficients 
(uj. |-i)a of Uj .|_i are given in the general form 


d{uj-\. i)q. — IE 


T a (£,-, u 3 ) L ^ (u j+ i) p T p (£j , Uj 


ds + a E [ T a (£j , Uj ) dW s \ , 


where a, j3 € J. Before integration of these expansion coefficients in time, Uj is represented 
in terms of its own orthogonal polynomials, which provides a description of the initial 
conditions on that interval. We then perform a high order time integration. Cumulants of 
the resulting solution are then computed to obtain relevant statistical information about 
the underlying distribution. 


Remark 3.1. The proposed methodology does not require the computation of any prob¬ 
ability density function (pdf) and rather depends only on its moments at each restart 
time. The statistical information contained in such moments provides useful data for de¬ 
cision making in modeling and, for determinate distributions, moments characterize the 
distribution uniquely. This aspect will be essential in the proof of convergence in section 

El 

Comparing our method with probability distribution function methods, e.g., the Fokker- 
Planck equation, we note that it essentially evolves the coefficients of orthogonal polyno¬ 
mials of the projected solution in time instead of evolving the pdf. 

Let Zj represents the nonlinear chaos expansion mapping (uj-\, £j_i) to Uj. Our 
scheme may be demonstrated by Figure [1] 


0 

ItQ 


1 1 
Hi 


t2 

U 2 




Figure 1: Propagation of chaos expansions. 
Mathematically, we have 

Uj = Zj(£j_i,Uj—i) = ^ ^ { u j)a T a (£j_ j, Uj— i), j £ N. 
a£j 


(3.2) 
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In simulations, is truncated with finite dimension K E N. Let also IV £ N denote 
the maximum degree of the polynomials in the variables (£j,u(tj)). Here, to simplify, K 
and N are chosen independently of j. The multi-indices having K dimensions and degree 
N belong to the set 77k,N ■= {ot = (aq, a 2 , olk) \ otj € No, |a| < IV}. 

We now present our algorithm, called Dynamical generalized Polynomial Chaos (DgPC), 
in one dimension for concreteness; see also section 13.2.41 

Algorithm Id: DgPC 

- decompose time domain [0, t] = [to, ti] U .... U [t n _i, t] 

- initialize degrees of freedom K , N 

- compute coefficients/indices in Hermite product formula for £ 0 

- integration in time: 

time step tj > 0 : obtain random variable Uj 
calculate moments E[(uj) m ] 
construct orthogonal polynomials Tk(uj ) 
compute triple products E [T^{uj) T[{uj ) T m (uj)\ 
perform Galerkin projection onto span{T Q ,(^, Uj)} 
set up initial conditions for the coefficients ( Uj) a 
evolve the expansion coefficients ( Uj) a 
calculate cumulants 

- postprocessing 

Several remarks are in order. First, since our stochastic forcing has identically dis¬ 
tributed independent increments, we observe that at each subinterval the distribution of 
is the same. Computing and storing (in sparse format) the coefficients for the product 
formula (13.31) only once drastically reduces the computational time. 

At each iteration, the random variable u(tj) = Uj is projected onto a finite dimensional 
chaos space and the next step is initialized with the projected random variable. For a 
d-dimensional SDE system, this projection leads to { K ^ N ) x terms in the basis for 

the subinterval [tj,tj+i]- Therefore, the total number of degrees of freedom used in [0,t] 
becomes n x ( K ^ N ) x ( d (j( r iV ); see also section 1331 We emphasize that small values of I \, N 
are utilized in each subinterval such that computations can be carried out quickly. 

The idea of iteratively constructing chaos expansions for arbitrary measures was con¬ 
sidered earlier in paCEBllIIlElE]. in pH [ 3 ], an iterative method is proposed to solve 
coupled multiphysics problems, where a dimension reduction technique is used to exchange 
information between iterations while allowing the construction of PC in terms of arbi¬ 
trary (compactly supported) distributions at each iteration. A similar iterative procedure 
for Hermite PC was suggested for stochastic partial differential equations (SPDEs) with 
Brownian forcing in the conclusion section of [19] without any mathematical construction or 
numerical example. We also stress an important difference between our approach and m- 
our scheme solely depends on the statistical information, i.e., moments, which are available 
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through chaos expansion, whereas m either requires the pdf of Uj at tj or uses mappings 
to transform Uj back to the original input random variables, which in our problem is high 
dimensional, including the £ variables. 

3.2 Implementation 

We now describe the implementation of our algorithm. 

3.2.1 Moments 

Provided that the distribution of the initial condition u$ is known, several methods allow 
us to compute moments: analytic formulas, quadrature methods, or Monte Carlo sampling. 
Also, in the case of limited data, moments can be generated from raw data sets. Throughout 
the paper, we assume that the moments of the initial condition can be computed to some 
large finite order so that the algorithm can be initialized. 

We first provide the Hermite product formula, which establishes the multiplication of 
two PCE (12.11) for random variables u and v. The chaos expansion for the product becomes 

uv = EE E C(a,/3,7)u«-/^7T3+7 T a(£), (3.3) 

a£j 7GJ7'0</3<q 

with C(a,/3, 7 ) = [(^) (^ 7 ) ( Q_ ^ +7 )] 1 ' /2 , [22], T9] . As our applications include nonlinear 
terms, this formula will be used repeatedly to multiply Hermite chaos expansions. 

To compute moments of the random variable Uj (recall (|3.2p ) at time step tj, j = 
1 ,... ,ra, we use its chaos representation and previously computed triple products recur¬ 
sively as follows. Due to the Markov property of the solution, the normalized chaos basis 
in £j-\ and Uj-\ becomes a tensor product, and thus we can write 

u j = 'y 'X u j)q' 2q' (£ 7 _1, Uj— 1 ) = ^ 'X uj)a,kTq(£j—i)Tk(Uj— i), 
a' a,k 

where a, a' € J and k £ No- Then for an integer m > 1, 
a,k (3,1 

Here, a, (3 € J and k,l € No- Hence, using (13.31) . the coefficients ( vJ - l ) 7 , r , where 7 € J 
and r € No, can be calculated using the expression 

( u ?)'r,r=J2 E E 

olSLJ O<0<7 k,l 

(3.4) 

which allows us to compute moments by simply taking the first coefficient, i.e. E[riT] = 
(u'j 1 ) o,o- Even though the coefficients C( 7 , 6, a) can be computed offline, the triple products 
E [T r {uj)Tk(uj)Ti{uj)\ must be computed at every step as the distribution of Uj evolves. 
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3.2.2 Orthogonal Polynomials 

Given the real random variable Uj obtained by the orthogonal projection of the solution 
u onto homogeneous chaos at tj, the Gram-Schmidt orthogonalization procedure can be 
used to construct the associated orthogonal polynomials of its continuous distribution. For 
theoretical reasons, we assume that the moment problem for Uj is uniquely solvable; i.e., 
the distribution is nondegenerate and uniquely determined by its moments mm- 

To construct orthogonal polynomials, the classical Gram-Schmidt procedure uses the 
following recursive relation 

m —1 

To = 1, T m (uj) = uf - Ttiuj), m > 1, 

1=0 

where the coefficients are given by a™ = E [u™ (E[T 2 (it j)]) 1 . Note that E denotes 

the Lebesgue-Stieltjes integral with respect to distribution of Uj. We observe that the 
coefficient a™ requires moments up to degree 2m — 1 as Ti is at most of degree m — 1. 
Thus, normalization would need first 2m moments for orthonormal polynomials up to 
degree m. 

After the construction of orthonormal polynomials of Uj , the basis T a , Uj ) can be 
generated by tensor products since and Uj are independent. Moreover, triple products 
E [T k (v,j) Ti(uj) T m {uj)\. where k, l, m = 0,..., N, can be found by simply noting that this 
expectation is a triple sum involving moments up to order 3 N. Hence, the knowledge of 
moments yields not only the orthonormal set of polynomials but also the triple products 
required at each iterative step. 

In our numerical computations, we prefer to utilize a modified Gram-Schmidt algorithm 
as the classical approach is overly sensitive to roundoff errors in cases of high dimension¬ 
ality m- We refer the reader to [12] for other algorithms, e.g., Stieltjes and modified 
Chebyshev methods. 

3.2.3 Initial conditions 

When the algorithm is reinitialized at the restart point tj, it needs to construct the initial 
condition in terms of chaos variables for the next iteration on [tj,tj+ 1 ]. In other words, we 
need to find the coefficients ( Uj) a in terms of Uj and £ ■. To this end, we observe that the 
first two polynomials in Uj are given by Tq (uj ) = 1 and T\{uj) = — E[-«j]), where 

a Uj is the standard deviation of the random variable Uj. Notice that a Uj > 0, because 
we assumed in the preceding section that the distribution was nondegenerate. Hence, the 
initial condition can be easily found by the relation 

Uj = E[uj]T 0 (uj) + a Uj T\{uj) = ^ (uj)a,k T a {£j) T k (uj). 

a£j,k £No 
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3.2.4 Extension to higher dimensions 

We now extend our algorithm to higher dimensions, and for concreteness, we consider two 
dimensions. Let Uj +1 = (vi,V 2 ) € M 2 be a two-dimensional random variable obtained by 
projection of the solution u onto homogeneous chaos space at time tj + \. 

The two-dimensional case requires the calculation of the mixed moments E[v[v™], l, m € 
N U {0}. In the case of independent components v\,V 2 , the moments become products of 
marginal moments and the basis T a (v 1 ,^ 2 ) is obtained by tensor products of the one¬ 
dimensional bases. For a nonlinear system of equations, the components of the solution 
typically do not remain independent as time evolves. Therefore, we need to extend the 
procedure to correlated components. 

Denoting by Z J+ i \ and -Zj+ 1,2 the corresponding chaos expansions on [tj,tj+ 1 ] so that 
v\ = Zj+i,i(£j, Uj) and V 2 = ^+1,2 (£j, Uj), we compute the mixed moments by the change 
of variables formula 


E[- 


v\ u 


1 u 2 


= E 


(«l,w)l«l V 2 


= E 




yl rym 

^ 3 + 1,1 * 3 + 1,2 


(3.5) 


where E(.) represents the expectation with respect to the joint distribution induced by the 
subscript. Note that the latter integral in ((3.5(1 can be computed with the help of previously 
computed triple products of and Uj. Incidentally, similar transformation methods were 
used previously in mm®- 

Since the components may not be independent, the tensor product structure is lost 
but the construction of orthogonal polynomials is still possible. Based on the knowledge 
of marginal moments, we first compute the tensor product T a (v\-V^)- However, this set 
is not orthogonal with respect to the joint probability measure of (ui,^)- Therefore, we 
further orthogonalize it via a Gram-Schmidt method. Note that this procedure requires 
only mixed moments E^ vlV2 \[v[v ij 1 ], which have already been computed in the previous 
step of the algorithm. It is worth noting that the resulting set of orthogonal polynomials 
is not unique as it depends on the ordering of monomials [38]. In applications, we consider 
the same ordering used for the set of multi-indices. Finally, calculation of triple products 
and initial conditions can be extended in an obvious way. 


3.3 Computational Complexity 

In this section, we discuss the computational costs of our algorithm using a vector-valued 
version of the SDE (13.11) in M. d for a fixed d € N. For each scalar SDE, we assume that 
the nonlinearity is proportional to mth order monomial with m € N. Furthermore, we 
compare costs of DgPC and Hernrite PC in the case where both methods attain a similar 
order of error. 

Throughout this section, K , N will denote the dimension and the degree in £ = (£ 1 ,..., £k) 
for Hernrite PC. Here £ represents the truncation of d-dinrensional Brownian motion; there¬ 
fore, I\ 3> d. The time interval is fixed and given by [0,1]. We also let M(K, N ) := ( K ^ N ) 
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be the degrees of freedom for the simple truncation (12.21) so that the dimension of trun¬ 
cated multi-indices Jk,n becomes M. For the DgPC method, we divide the interval into 
n > 1 identical subintervals so that At = n _1 . We now slightly change the notation 
of previous sections and denote by A*, IV* the dimension and degree of polynomials of 
= (£jK*+ i, • • • ,^jK„+K„) used in DgPC approximation for each 1 < j < n. With addi¬ 
tional d variables at each restart, the dimension of the multi-index set for DgPC becomes 
M(A*, IV*)M(d, IV*) due to the tensor product structure. Further, let h < 1 and Q > 1 
denote the time step and global order of convergence for the time integration method 
employed in Hermite PC, respectively. 


3.3.1 Computational costs 

With the above notation, we summarize the computational costs in Table [1] All terms 
should be understood in big O notation. 


Flops 

DgPC 

Hermite PC 

Offline 

AT xM(AT,W) 3 

K x M(K, IV) 3 

Time evolution 

dxh~ l xn 1 /f x (M(A*,iV*)M(d, IV*)) x 

dxh~ 1 xM(K, IV)x[l+ 


[1 + (m - 1) x (M(A*, IV*)M(d, IV*)) 2 ] 

(m - 1) x M(K, IV) 2 ] 

Moments 

n x (M(AT*,IV*)M(d,IV*)) 3 x [(6 d - 
3d 2 ) x IV* + {d - 1) x M(d, 3IV*)] 


Gram-Schmidt 

n x M(d, IV*) 3 


Triple products 

n x Af(d,AI*) 6 


Initials 

n x d x (M(A*, IV*)M(d, IV*)) 



Table 1: Total computational costs. 


The estimates in Table |T| are obtained as follows. Triple products for £ with a, /3 ,7 E 
Jk.n can be calculated by the equation 


E[T a (07M0T 7 (0] =E 


K I< K 


i =1 j =1 k =1 


K 

2=1 


Thus, the offline cost is of order K x M(K,N ) 3 assuming that one-dimensional triple 
products can be read from a precomputed table. 

Since a time discretization with error 0(h £) is employed for Hermite PC, then for DgPC 
in each subinterval, time steps of order should be used to attain the same global 

error. Without nonlinearity, the total cost in Hermite PC for evolution of a d-dimensional 
system becomes d x h~ l x M(K,N). Due to the functional L(u) oc u m ,m > 1, the cost 
should also include the computation of u m at each time integration step. This requires 
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additional d x h~ 1 x (m — 1) x M(A', IV ) 3 work (see the computation of moments below). 
Hence, the corresponding total cost of evolution becomes the sum of these costs. 

The coefficients of the kth power of a single projected variable Uj are given by 

(«J) 7 = WaTpT^], ?<k< 31V*, 

a,/3 

assuming that, after each multiplication, the variable is projected onto the PC basis. Here, 
a, f3 ,7 € < 8 > Jd,N,- Thus, the computation of marginal moments of all variables re¬ 

quires dx 31V* x (M(iV*,lV*)M(d, IV *)) 3 calculations at each restart time in DgPC. Since we 
use the same ordering for multidimensional moments as J^d, 3 N *, computing joint moments 
further needs (d — 1) x (M(d, 31V*) — 3dZV*) x (M(iV*, lV*)M(d, IV *)) 3 amount of work. 

The Gram-Schmidt procedure costs are cubic in the size of the Gram matrix, which 
has dimension M(d,lV*) in DgPC. Each triple product ¥.[T a (uj)T^(uj)Tj(uj)], a, /3 ,7 € 
Jd,N„, is a triple sum; therefore, the total cost is proportional to M(d, IV*) 6 . Also, initial 
conditions can be computed by orthogonal projection onto the DgPC basis, which costs 
M (A'*, IV *)M(d, IV*) per dimension. 

Note that although we employed simple truncation (|2.2I) for multi-indices, sparse trun¬ 
cation techniques can be utilized to further reduce the costs of both computing moments 
and evolution; see [Sana ee]. 

3.3.2 Error bounds and cost comparison 

We now discuss the error terms in both algorithms. We posit that the Hermite PCE 
converges algebraically in N and K and that the error at time t satisfies 

\\u - u pce \\ L 2 < t^N-v + K~ x ), (3.6) 

for some constants rj, A > 0 and 5 > 1 depending on the SDE. Here A < B denotes 
that A < cB for c > 0 with A, B > 0. This assumption enforces an increase in the 
degrees of freedom IV, K if one wants to maintain the accuracy in the long term. Algebraic 
convergence in K and the term stems from the error bound (12.41) . We do not show 
errors resulting from time discretization in (13.61) since we already fixed those errors to the 
same order in both methods. Incidentally, even though we assumed algebraic convergence 
in IV, exponential numerical convergence, depending on the regularity of the solution in £, 
is usually observed in the literature [3U S3 Eh 22, 07] :3T . 

Although it is usually hard to quantify the constants rj, A, 5, this may be done for 
simple cases. For instance, for the Ornstein-Uhlenbeck (OU) process (14.31) . using the exact 
solution and Fourier basis of cosines, we can obtain the parameters 5 = 2 and A = 3/2 for 
short times. The parameter 77 does not play a role in this case since the SDE stays Gaussian 
and hence we take N = 1. We also note that for complex dynamics, r/ may depend on N ; 
see EH El] in the case of the stochastic Burgers equation and advection equations. 
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Now, we fix the error terms in (13.61) to the same order 0(e), e > 0, by choosing 
A = 0(K X ^). Since DgPC uses truncated £ of dimension K* and polynomials of degree 
IV* in each subinterval of size At = n —1 , the L 2 error at t = 1 has the form 

I \n - u dgpc \\ l2 < (A"*? + if" A ) . (3.7) 

Thus, to maintain the same level e of accuracy, we can choose 

n X ~ 5 K~ x = K~ x and = K~ x . 

From Table [H we observe that to minimize costs for DgPC, we can take AT* = d and A* = 1 
and maximize n according to the previous equation as 

n = A'^i. (3.8) 

With these choices of parameters, the total computational cost for DgPC is of algebraic 
order in K and in general, is dominated by the computation of moments. 

For Hermite PC with large K and N = 0(K X ^ ,J ) (or N = 0(\/rj\ogK) in the case of 
exponential convergence in N ), both offline and evolution stages include the cubic term 
M(K, A) 3 . Both costs increase exponentially in K. Therefore, the asymptotic analysis 
suggests that DgPC can be performed with substantially lower computational costs using 
frequent restarts (equation (13.81) 1 in nonlinear systems driven by high dimensional random 
forcing. 

3.4 Convergence results 

We now consider the convergence properties of our scheme as degrees of freedom tend to 
infinity. 

3.4.1 Moment problem and density of polynomials 

There is an extensive literature on the moment problem for probability distributions; see [U 
mmm and their references. We provide the relevant background in the analysis of DgPC. 

Definition. (Hamburger moment problem) For a probability measure p on (R, the 

moment problem is uniquely solvable provided moments of all orders n{dx), k € No, 
exist and they uniquely determine the measure. In this case, the distribution p, is called 
determinate. 

There are several sufficient conditions for a one-dimensional distribution to be deter¬ 
minate in Hamburger sense: these are compact support, exponential integrability and 
Carleman’s moment condition mm- For instance, Gaussian and uniform distributions 
are determinate, whereas the lognormal distribution is not. The moment problem is in¬ 
trinsically related to the density of associated orthogonal polynomials. Indeed, if the 
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cumulative distribution F n of a random variable u is determinate, then the correspond¬ 
ing orthogonal polynomials constitute a dense set in L 2 (R, B(R), F u ) and therefore also in 
L 2 (fi, P) mm 10]. Additionally, finite dimensional distributions on R d with compact 
support are determinate; see m- 

Now, let £ = (Ci)ieN denote an independent collection of general random variables, 
where each Q (not necessarily identically distributed) has finite moments of all orders and 
its cumulative distribution function is continuous. Under these assumptions, [HJ proves the 
following theorem. 

Theorem 3.2. For any random variable r) E L 2 (fl, <r(£), P), gPC converges to g in L 2 if 
and only if the moment problem for each Q, i E N, is uniquely solvable. 

This result generalizes the convergence of Hermite PCE to general random variables 
whose laws are determinate. Notably, to prove L 2 convergence of chaos expansions, it 
is enough to check one of the determinacy conditions mentioned above for each one¬ 
dimensional ({. 

We now consider the relation between determinacy and distributions of SDEs. Consider 
the following diffusion 

du s = b(u s ) ds + cr(u s ) dW s , u(0) = uq, sE[0, t\, (3-9) 

where W s is d-dimensional Brownian motion, uq E R rf , and b : W l —>• R d ,a : R d —>• R dxd 
are globally Lipschitz and satisfy the usual linear growth bound. These conditions imply 
that the SDE has a unique strong solution with continuous paths [25] . Determinancy of 
the distribution of u s is established by the following theorem; see [23 [HI E] for details. 

Theorem 3.3. The law of the solution of (13.91) is determinate if sup x ||crcr T (x)|| is finite. 

3.4.2 L 2 convergence with a finite number of restarts 

Consider the solution of the SDE (13.91) written as 

u s = F s (u 0 ,£ o), (3.10) 

where F s : x M°° —>• M. d is the exact evolution operator mapping the initial condition 

uo and = (£i>£ 2 >- ■ ■) the solution u s at time s > 0. Here, with a slight change of 
notation, represents Brownian motion on the interval [0, s]. Note that future solutions 
u s+T , r > 0, are obtained by F s (u r ,£ x +T ), where £* +r denotes Brownian motion on the 
interval [r, r + s]. 

Even though the distribution of the exact solution u s is determinate under the hy¬ 
potheses of Thm. 13.31 it is not clear that this feature still holds for the projected ran¬ 
dom variables. To address this issue, we introduce, for R E M + , a truncation function 
X r E C~(R d -)• R d ) 


Xr(u) ■ 


u, when u E Ar, 

0, when u E \ A^r, 
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where Ar := nf=i[ — R,R] C M d , and such that xr{ u ) decays smoothly and sufficiently 
slowly on A C R such that the Lipschitz constant Lip(y^) of xr equals 1; i.e., for | • |, the 
Euclidean distance in M rf , | xr( u ) — Xr ( v )I < \u — v\. Such a function is seen to exist by 
appropriate mollification for each coordinate using the continuous, piecewise smooth, and 
odd function x{ u i) defined by R— \u\ — R\ on [0, 2 R\, and extended by 0 outside [—2 R, 2R\. 
This truncation is a theoretical tool that allows us to ensure that all distributions with 
support properly restricted on compact intervals remain determinate; see m- 

We can now introduce the following approximate solution operators for R £ M+ and 
M £ N 0 : 


:= XR°F s (u,e 0 ) : x M°° ^ A 3 r, (3.11) 

M -1 

F s M ’ R (u,eo)--=P M °F s R (u,e o )= g), (3.12) 

i =o 

where P A1 denotes the orthogonal projection onto polynomials in u and £q, and M is the 
total number of degrees of freedom used in the expansion. Throughout this section, we use 
a linear indexing to represent polynomials in both u (restricted on A 3 r) and the random 
variables 

The main assumptions we imposed on the solution operator F s are summarized as 
follows: 


i) E^g|F s (0, £g)|P < C s , where 0 < C s < 00 and p = 2 + e with e > 0. 

ii) E^g|F s (u,^o) — F s (v ,£§)| p < pf \u — v\ p , where u,v £ M d , 0 < p s < 00 , and p = 2 + e 
with e > 0. 

iii) For e > 0 and R > 0, there is M(e,R,s ) so that E^g|F s K (u,^o) — £q)| 2 — £ i 

where u £ A^r. 


Assumption [iT| is a stability estimate ensuring that the chain remains bounded in pth norm 
starting from a point in (here 0 without loss of generality owing to[iI)|). Assumption |h)| 
is a Lipschitz growth condition controlled by a constant p s . These assumptions involve the 
L p norm, and hence are slightly stronger than control in the L 2 sense. Note that F^ also 


satisfies assumption ii) with the same constant p s since Lip(y^j) = 1. Finally, assumption 


iii) is justified by the Weierstrass approximation theorem for the u variable in A^r and by 


the Cameron-Martin theorem to handle the £ variable. 

Consider now the following versions of the Markov chains: 


Uj+i '■= F s (uj,£j), j = 0 , 1 , 

uf+i ■■= F ^(uf, £j), u$ = xrM, j = 0,..., n - 1, 

M,R tpM,R/ m ,R f \ M,R ( \ • n , 

Uj +1 := F s ’ ( Uj , Zj), u 0 = xr{u 0 ), J = 0 ,... , n - 1 , 


(3.13) 

(3.14) 

(3.15) 
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where uq € (possibly random and independent of the variables £) and = ^ +1 ^ s is a 
sequence of i.i.d. Gaussian random variables representing Brownian motion on the interval 
+ l)s]. We also use the notation £q S to denote Brownian motion on the interval 
[0,js). Then we have the first result. 


Lemma 3.4. Assume \T)\ ii), and in) hold, and let n E N he finite. Suppose also that 
E|uo| p = Co < oo. Then, for each e > 0, there exists R £ M + depending on n, s, and Cq 
such that K\iij — u^\ 2 < e for each 0 < j < n. 


Proof. Let e > 0 be fixed. Using properties [1)1 and ii), we observe that 


E,(j+l)s Ws ! 

SQ 


< 2P- 1 



- C s (0,Cj 


+ 

so J 


l^(o, 


< 2 P_1 



where for all j, and are identically distributed. Then, by induction, we obtain 

E £ (j+i).|«j+i| p < (2 v ~V s y +l E|«or + CP n < C s ^ Co , o < j < n - 1, (3.16) 

“OiSO 

where the constant C St n : c 0 is bounded. The last inequality indicates that pth norm of the 
solution grows with j, possibly exponentially, but remains bounded. 

For Uj = F js {u 0 ,£ J 0 s ), we note that E|ttj| 2 = E„ o \ u j\ 2 since Uj is independent of 
future forcing. By Markov’s inequality and 43.161) . we get 

Elu -I 2 

P( Uj eAfi) < <C s , n ,c 0 R~ 2 , 1 <j<n. (3.17) 

Let 1a be the indicator function of the set A. Using ()3. 161) and (13.171) . we compute the 
following error bound 


E K - XRUj I 2 = E[(l {u . eAfl} + 1 {uj&A c R })\uj - XRUjl 2 ], 1 <j<n, 

< 4E[l KeA c ;} |u i | 2 ] < A(E\ Uj \P) 2 /PP( Uj € Afiffi p = 2 + e,q = l + 2c" 1 , 
<C s , n , Co R- 2/q . (3.18) 


Then, using property ii) gives 

E^.Juj - uf | 2 = E^JF^ Uj - i ,^-.!) - 

< (! + - XRUj \ 2 

+ (1 + - F^uf^Z^l 2 , 5 > 0 . 

< (1 + < 5 -1 )| uj - XRUj\ 2 + (1 + 8)p 2 s\uj -1 - uf_f\ 2 . 
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Taking expectation with respect to remaining measures and using (13.1811 yields 

nuj - uf\ 2 < (1 + 5)p 2 E\u j - l - | 2 + (1 + <5 -1 )C's i7 i j c' 0 R~ 2/q , 

3-1 

< ((1 + 5)p 2 s ) j E\u 0 - u R \ 2 + (1 + S-^Cg^Co R~ Vq X)((l + %»)*■ ( 3 - 19 ) 

i =0 


Setting 5 = 1 in (|3.19l) entails 

E| u d - uf\ 2 < 2>f E | Uo _ u Rf + C SiriiCo ^“ 2/9 - 

Since E|tto| p < oo, we also have E|rto — if) 2 = 0(R~ 2 / q ). Thus, we can choose i?(n, s, Co) € 
M + large enough such that E |Uj — uf\ 2 < £ for each 0 < j < n. □ 

Based on the preceding lemma, we prove that the chain (13.1511 converges to the solution 
of the chain (13.1311 in the setting of a finite number of restarts. 

Theorem 3.5. Under the assumptions of Lemma \3.4\ for each e > 0, there exists R(n , s, Co) 
M + and then M(R, n, s) € No such that E| Uj — zt • ’ | 2 < e for each 0 < j < n. 

Proof. Let £ > 0 be fixed. By the previous lemma, choose R > 0 such that E|uj—uf | 2 < e/4 
for 0 < j < n. Then, using calculations similar to those above, by ii) we get 

%._>? - uf ’ R | 2 < (1 + 5 )E €j _ 1 |F*(uf_ 1 ,^._ 1 ) - ,^_ 1 )| 2 

+ (1 + r 1 )E^._ 1 |Ff(uff - F s M ’ R (uf_!?,Z j _ 1 )\ 2 ,6> 0, 

< 2p 2 | u R _, - uf.f| 2 

+ 2E Cj _ 1 |Ff (uff ,^_ 1 ) - Ff’ R (uff ,^_!)| 2 , ^ = 1, 

= p. | uf_ x - tiff | 2 + 2E^_jFf (uff ,^_ 1 ) - Ff ^(uff ,^_ 1 )| 2 , 
where / > 1 and p* := 2p 2 . Then by iii) we choose M(R,n, s) sufficiently large so that 


€ 


Trp I R M,Ri2 ^ I R M,R\2 . £ P* 1 
E ^-l I U 3 - U 3 I ^ P* \ U 3~1 ~ V1 I + 4^31- 

The last inequality can be rewritten as 

E|«f - tif ’ R \ 2 < pi E|uf - uf 1 f 2 + e/4 = e/4. 


(3.20) 


Therefore, E |Uj — u 


M,R 12 


< e for each 1 < j < n. 


□ 
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3.4.3 Convergence to invariant measures and long time evolution 

Consider the setting of the solution operator to the SDE given in (13.1011 . As s increases to 
oo, the random variable u(s) may converge in distribution to a limiting random variable 
Uoo , whose distribution is the invariant measure of the evolution equation (13.101) . Although 
there are many efficient ways to analyze and compute such invariant measures (see for 
instance m), we wish to show that our iterative algorithm also converges to that invariant 
measure as degrees of freedom tend to infinity. In other words, our PCE-based method 
allows us to remain accurate for long-time evolutions-something that cannot be achieved 
without restarts. 

Now we iterate each discrete chain in (13.131) . (|3.14l) and (|3.15l) for an arbitrary j G No- 
In order to ensure long-time convergence, we need a stricter condition than 
instead the following contraction condition 


n, 


and impose 


ii’) E£g|F s (u,£o) - Fs{v,€o)\ P < fi\u - v\ p , where 0 < p s < 1, p = 2 + e, e > 0, and 
u, v G R d . 

We first prove the following result about the existence and uniqueness of an invariant 
measure for the original chain (13.131) . 


Lemma 3.6. Under conditions^)) and ii’), there exists a unique invariant measure v of the 
chain (13.131) with bounded pth moment. Moreover, if E|tto| p < oo, then K\uj\ p is bounded 
uniformly in j. 

Proof. For u G (and using that \a + b\ p < (1 + 5) p \a\ p + (1 + 5 -1 )) p |6| p ), we compute 


%|i ? s K^)l p <(l + 5) p %|F s (u,^)-F s (0,^)| p + (l + r 1 ) p E^|F s (0,^)| p , 5 > 0 , 

< ((1 +S)p s ) p \u\ p + (l + S- 1 ) p C s , 

= p* \u\ p + C, 


where 6 is chosen so that p* := ((1 + S)p s ) p < 1 and C < oo. Thus, there exists a 
Lyapunov function V(u) = \u\ p with a constant p* < 1. This in turn implies the existence 
of an invariant measure v with finite pth moment for the process (13.131) ; see [I7t Corollary 
4.23]. Uniqueness also follows from assumption ii’) see [T7 . Theorem 4.25]. 


Let vq be a random variable with law u and independent of u$. Consider the chain 
Vj .|_i = F s (vj,£j) started from vq. Note Vj ~ v for all j G Nq. Then, from the bound 


E| Uj - Vj\ p < (pP) 3 E|u 0 - u 0 | p , 


we conclude that sup^ E|uj| p < oo. □ 

The following theorem establishes the exponential convergence of the PC chain (13.151) 
to the chain (13.131) as time j increases. 
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Theorem 3.7. Assume^, ii’) and Hi) hold, and E|rto| p < oo. Then, for each e > 0, there 
exists R > 0 independent of j, and then M(R, s ) € No, such that K\uj — u^’ R \ 2 < e for all 
j G N 0 . 


Proof. Let e > 0 be given. By Lemma [3. 6 1 is bounded uniformly in j. The inequality 

(13.181) holds with a constant independent of j. Then, choosing 6 > 0 so that (1 + 5)p 2 < 1 
in (13.191) implies that there exists a constant R(s, is) > 0 independent of j such that 
E |Uj — Uj 1 ) 2 < e/4. 

As in the proof of Theorem 13.51 we choose 5 > 0 so that p* := (1 + 6)p 2 < 1. Then, 
with M (if, s ) € Nq large enough, (13.201) is replaced by 


E. 


1 


— u 


M,R i2 
j 


< P* 


I R M,R 12 \ ■ \ i 

l«i-l - Vl I +7 j ^ L 


Therefore, 


E\uf-u?’ R \ 2 < pi E\u«-u» 


+ ~ P*)i l ~ P*) 1 =^/ 4 , j> !• 


This concludes our proof of convergence of the DgPC method to the invariant measure 
is. □ 


Let us conclude this section with the following remark. 

Remark 3.8. It would be desirable to obtain that the chain (corresponding to R = oo, 
or equivalently no support truncation) u A T +l = £ ; -) remains determinate. We were 

not able to do so and instead based our theoretical results on the assumption that the true 
distributions of interest were well approximated by compactly supported distributions. In 
practice, the range of M has to remain relatively limited for at least two reasons. First, 
large M rapidly involves very large computational costs; and second, the determination of 
measures from moments becomes exponentially ill-posed as the degree of polynomials N 
increases. For these reasons, the support truncation has been neglected in the following 
numerical section since, heuristically, for large R and limited N, the computation of (low 
order) moments of distributions with rapidly decaying density at infinity is hardly affected 
by such a support truncation. 

4 Numerical Experiments 

In this section, we present several numerical simulations of our algorithm and discuss the 
implementation details. We consider several of the equations described in j5( OH] to show 
that PCE-based simulations may perform well in such settings. We mostly consider the 
following two-dimensional nonlinear system of coupled SDEs: 


du(s ) = — {b u + a u v(s))u(s ) ds + f{s) ds + a u dW u (s), 
dv(s ) = — (b v + a v u(s)) v(s ) ds + a v dW v (s ), 


(4.1) 
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where a Ul a v > 0, b u . b v > 0 are damping parameters, <r n ,< tr v > 0 are constants, and W u 
and W v are two real independent Wiener processes. 

The system (14.11) was proposed in [13] to study filtering of turbulent signals which 
exhibit intermittent instabilities. The performance of Hermite PCE is analyzed in various 
dynamical regimes by the authors in [5], who conclude that truncated PCE struggle to 
accurately capture the statistics of the solution in the long term due to both the truncated 
expansion of the white noise and neglecting higher order terms, which become crucial 
because of the nonlinearities. For a review of the different dynamical regimes that (14.11) 
exhibits, we refer the reader to USE]- 

For the rest of the section, t £ K stands for the endpoint of the time interval while 
At = t/n denotes the time-step after which restarts occur at tj = jAt. Moreover, K 
denotes the number of basic random variables used in the truncation of the expansion 
(12.3j) . In the presence of two Wiener processes it will denote the total number of variables. 
We slightly change the previous notation and let N and L denote the maximum degree of 
the polynomials in £ and Uj, respectively. Recall that Uj is the projected PC solution at 
tj. Furthermore, following mm we choose the orthonormal bases for L 2 [tj-i , tj] as 


mi(s) 


y/Tj tj -1 


mis) 



( (»- l)ra \ 

V tj — tj- 1 J 


i > 2, s € [tj-i,tj], 

(4.2) 


Other possible options for rrii(s) include sine functions, a combination of sines and cosines, 
and wavelets. We refer the reader to [23] for details on the use of wavelets to deal with 
discontinuities for random inputs. 

Assuming that the solution of (14.11) is square integrable, we utilize intrusive Galerkin 
projections in order to establish the following equations for the coefficients of the PCE: 

ii a — b u u a + a u {uv^ql T / 5q,o T ct u _F[I / t /, 1( T' Q ,], 

Vex — b v V a T d v (uV^cx + (T v E [I4C Tex ] • 


This system of ODEs is then solved by either a second- or a fourth-order time integration 
method. Finally, we note that other methods are also available to compute the PCE 
coefficients such as nonintrusive projection and collocation methods 

In most of our numerical examples, the dynamics converge to an invariant measure. 
To demonstrate the convergence behavior of our algorithm, we compare our results to 
exact second order statistics or Monte Carlo simulations with sufficiently high sampling 
rate N samp (e.g., Euler-Maruyama or weak Runge-Kutta methods [20] 4 where the exact 
solution is not available. Comparisons involve the following relative pointwise errors: 


fmean(s) : — 


Mpc e(s) — ^exact( s ) 


/ i exact( s ) 


evar(s) 


*pce(») “ (s) 


a 


exact 


(*) 
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where fi. a 2 represents the mean and variance. In the following figures, we plot the evo¬ 
lution of mean, variance, e me an and e V ar using the same legend. In addition, we exhibit 
the evolution of higher order cumulants, which will be denoted by k to demonstrate the 
convergence to steady state as time grows. Finally, in our numerical computations, we 
make use of the C++ library UQ Toolkit [Tj. 

4.1 Numerical examples 

Example 4.1. As a first example, we consider the one-dimensional Ornstein-Uhlenbeck 
(OU) process 


dv(s) = — b v v(s) ds + a v dW v (s), s 6 [0,1], u(0) = vq, (4-3) 

on [0, 3] with the parameters b v = 4, a v = 2, Vo = 1. We first present the results of Hermite 
PCE. 



(a) mean (b) variance (c) emean (d) evar 


Figure 2: Hermite PCE. 

Figure [2] shows that Hermite PC captures the mean accurately but approximation for 
the variance is accurate only at the endpoint. This is a consequence of the expansion 
in (12.41) . which violates the Markov property of Brownian motion and is inaccurate for 
all 0 < s < t, while it becomes (spectrally) accurate at the endpoint t. Using frequent 
restarts, these oscillations are significantly attenuated as expected; see Figure |3(b)| This 
oscillatory behavior could also be alleviated by expanding the Brownian motion as in (|2.4I) 
on subintervals of (0,1). Note that the global basis functions T^(£) may also be filtered by 
taking the conditional expectation E[T^,(^)|+)]' ], where J 7 ] 1 ' is the u-algebra corresponding 
to Brownian motion up to time s [22, 24]. We do not pursue this issue here but note that 
the accuracy of the different methods should be compared at the end of the intervals of 
discretization of Brownian motion. 

Next, we demonstrate numerical results for DgPC taking N = 1, L = 1 and a varying 
K = 4, 6,8. Figure [3] shows that as the exact solution converges to a steady state, our 
algorithm captures the second order statistics accurately even with a small degrees of 
freedom utilized in each subinterval. Moreover, although we do not show it here, it is 
possible, by increasing L, to approximate the higher order zero cumulants k*, i = 3,4, 5, 6, 
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with an error on the order of machine precision, which implies that the algorithm converges 
numerically to a Gaussian invariant measure. 



(a) mean 


(b) variance 


(c) emean 


(d) evar 


Figure 3: DgPC At = 0.2. 


In Figure [H we illustrate an algebraic convergence of the variance in terms of the 
dimension K for t = 3 and t = 15. DgPC uses the same size of interval At = 0.2 for both 
cases. For comparison, we also include the convergence behavior of Hermite PCE. Values 
of K are taken as (2,4, 6,8) for Figures 4(a) and 4(c) and K = (8,16,32,64) for Figures 


4(b) and |4(d)| We deduce that our algorithm maintains an algebraic convergence of order 
0(K ~ 3 ) for both t = 3 and t = 15 whereas the convergence behavior of Hermite PCE 
drops from 0(K~ 3 ) to 0(K~ 2 ) as time increases. This confirms the fact that the degrees 
of freedom required for standard PCE to maintain a desired accuracy should increase with 
time. 



(a) DgPC (t = 3) 


(b) Hermite PC (t = 3) 


(c) DgPC (t = 15) (d) Hermite PC (t = 15) 


Figure 4: K convergence. 


Example 4.2. We now introduce a nonlinearity in the equation so that the damping term 
includes a cubic component: 

dv = —(v 2 + l)v ds + a v dW v , u(0) = 1. 

Figure [5] displays several numerical simulations for the degree of polynomials in £ given 
by N = 1,2,3 and a v = 2. This is compared with the weak Runge-Kutta method using 
Nsamp = 200000 and dt = 0.001. We observe that increasing N improves the accuracy of the 
solution to this nonlinear equation as expected. Moreover, Table [2] presents a comparison 
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(a) mean 


(b) variance 


(c) cumulants 


(d) cumulants 


Figure 5: DgPC At = 0.25. 


for statistical cumulants between our method with K = 5, IV = 2, L = 4 and Monte 
Carlo (MC) at time t = 4. The (stationary) cumulants of the invariant measure are also 
estimated by solving a standard Fokker-Planck (FP) equation [25, 41] for the invariant 
measure. We conclude this example by noting that the level of accuracy of approximations 
in DgPC for cumulants is similar to that of the MC method. 



«i 


« 3 


k 5 

k 6 

DgPC 

3.58E-5 

7.33E-1 

-2.08E-5 

-3.37E-1 

6.02E-5 

9.35E-1 

MC 

-2.53E-3 

7.33E-1 

7.85E-4 

-3.38E-1 

-3.34E-3 

9.58E-1 

FP 

0 

7.33E-1 

0 

-3.39E-1 

0 

9.64E-1 


Table 2: Cumulants. 


Example 4.3. As a third example we consider an OU process (14.31) in which the damping 
parameter is random and uniformly distributed in [1,3], i.e. b v ~ £7(1,3). This is an 
example of non-Gaussian dynamics that may be seen as a coupled system for (v,b v ) with 
db v = 0. 


We consider a time domain [0,8] and divide it into n = 40 subintervals. The initial 
condition is normally distributed vq ~ 1V(1.0.04) II W v and a v = 2. In the next figure, 
we compare second order statistics obtained by our method to Monte Carlo method for 
which we use the Euler-Maruyama scheme with the time step dt = 0.002 and sampling 
rate N samp = 1000 x 1000 implying 10 6 samples in total. We stress again that this problem 
is essentially two-dimensional since the damping is random. 

As expected, the mean decreases monotonically and is approximated accurately by 
the algorithm. The estimation of the variance becomes more accurate as N increases. 
Furthermore, Figures 6(c) and |6(d)| show that the cumulants become stationary for long 
times indicating that the numerical approximations converge to a measure which is non- 
Gaussian. 

Table [3] compares the first six cumulants obtained by our algorithm at time t = 8 
with K = 4, N = 3, L = 4 to the Monte Carlo method. For further comparison, we 
also provide cumulants obtained by averaging Fokker-Planck density with respect to the 
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Figure 6: DgPC At = 0.2. 


known, explicit, distribution of the damping. It can be observed from the following table 
that both our algorithm and Monte Carlo capture cumulants reasonably well although the 
accuracy degrades at higher orders. 



Kl 

K2 

*3 

K 4 

k 5 

K 6 

DgPC 

1.68E-5 

1.10 

3.80E-5 

3.78E-1 

8.72E-5 

5.04E-1 

MC 

-1.78E-4 

1.10 

-3.25E-3 

3.70E-1 

-6.34E-2 

4.65E-1 

FP 

0 

1.10 

0 

3.79E-1 

0 

5.29E-1 


Table 3: Cumulants. 


The above calculations provide an example of stochasticity with two distinct compo¬ 
nents: the variables £ that are Markov and can be projected out and the non-Markov 
variable b v that has long-time effects on the dynamics. The variables (v,b v ) are strongly 
correlated for positive times. PCE thus need to involve orthogonal polynomials that reflect 
these correlations and cannot be written as tensorized products of orthogonal polynomials 
in v and b v . 

Example 4.4. We demonstrate that our method is not limited to equations forced by 
Brownian motion. We apply it to an equation where the forcing is a nonlinear function of 
Brownian motion: 


dv(s) = —b v v{s) ds + a v d(W 2 (s ) — s), u(0) = 1, 

with parameters b v = 6, cr v = 1. To depict the effect of different choices of number of 
restarts, we take At = 0.5,0.3,0.1 using K = 5, N = 2, L = 2 in each expansion. 

From Figure [3 we observe that second order statistics are captured accurately and 
an increasing number of restarts provides better approximations as anticipated. Although 
the system does not converge to a steady state (variance of v(s) increases linearly), this 
example illustrates that DgPC is able to capture behaviors of solutions where the forcing 
is a nonlinear function of Wiener process. 




















DgPC 


25 



(a) mean 


(b) variance 


(c) emean 


(d) evar 


Figure 7: DgPC At = 0.5, 0.3,0.1. 


Example 4.5. This example concerns the nonlinear system (14.11) . where the dynamics of 
u exhibit intermittent non-Gaussian behavior. A time dependent deterministic periodic 
forcing is considered with the second equation being an OU process, i.e. a v = 0 in (14.11) . 
and the parameters are taken as b u = 1.4, b v = 10 , a u = 0.1, a v = 10 , a u = 1, 
f{t) = 1 + l.lcos(2t + 1) + 0.5cos(4f), with initial conditions uq = N( 0. a 2 ^/8b u ) II vn = 
N(0,a 2 /8b v )\ see also [5|. The initial variables are independent of each other and of the 
stochastic forcing. Intermittency is introduced by using an OU process, which acts as a 
multiplicative noise fluctuating in the damping. The second order statistics for this case 
( a v = 0) can be derived analytically |Hj and we plot them in black in the following figures 
using quadrature methods with sufficiently high number of quadrature points. 

We first depict the results for u obtained by standard Hermite PCE in the first row 
of Figure 0 It can be observed that truncated PC approximations are accurate for short 
times. However, they quickly lose their accuracy as time grows. This is consistent with the 
simulations presented in [5]. The second row of the same figure shows that the accuracy 
is vastly improved in DgPC using At = 0.1, and second order statistics are accurate up 
to three or four digits. Moreover, errors in the variance oscillate around 0(1CU 3 ) for 
L = 3 throughout the time length suggesting that the approximation retains its accuracy 
in the long term, which is consistent with theoretical predictions. Our algorithm easily 
outperforms standard PCE in capturing the long-time behavior of the nonlinear coupled 
system (14.11) in this scenario. 


Example 4.6. For this scenario, we consider the system parameters as a u = 1 , a v = 0, 
b u = 1.2, b v = 0.5 ,a u = 0.5, cr v = 0.5 without the deterministic forcing / = 0 and with the 
initials uq = N(l, a^/8b u ) _lLuo = N(0,al/8b v ). In this regime, the dynamics of u(s) are 
characterized by unstable bursts of large amplitude |5j. 

Second order statistics obtained by Hermite PCE and our algorithm are presented in 
Figure 0 where the first two parts (Figures 9(a) and 9(b)) correspond to Hermite PCE. 
We observe from Figure [9 (b)| that the relative errors for standard Hermite expansions are 
unacceptably high. On the other hand, as the degrees of freedom is increased in DgPC, 
there is a clear pattern of convergence of second order statistics; see Figures 9(e) and 
|9(f)| Note, however, that short-time accuracy is better than accuracy for long times. This 












































DgPC 


26 






(a) mean(u) 


(b) var(u) 


(c) emean 


(d) e V ar 


var(u) 
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Figure 8: Hermite PC & DgPC At = 0.1. 


behavior may be explained by the onset of intermittency as nonlinear effects kick in after 
short times. Further, Figure [9(g)| shows the error behavior of variance at t = 10 as N and 
L vary, and K is fixed. The rate of convergence is consistent with exponential convergence 
in N and L as the logarithmic plot is almost linear. For similar convergence behaviors, see 
ISl EH [37]. 






(a) mean(u) (Hermite) (b) var(u) (Hermite) 


(c) mean(u) 


(d) var(u) 



Figure 9: Hermite PC <fc DgPC At = 0.1. 


Finally, in Figure flOl we present the evolution of the first few cumulants obtained 
by DgPC, which shows that the system converges to a steady state distribution as time 
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(a) l si &2’ ld order (b) 3 rd order 


(c) 4 th order 



(d) 5 th order 


(e) 6 th order 


(f) kurtosis excess 


Figure 10: Bivariate cumulants (a v = 0). 


increases. Bivariate cumulants are ordered so that first and second subscripts correspond 
to u and v, respectively. Figure |10(f)| shows kurtosis excess for u, v and clearly indicates 
that the dynamics of v stay Gaussian, whereas those of u converge to a non-Gaussian state. 

Example 4.7. As a final example, using the same set of parameters of the previous 
example, we introduce a small perturbation to the second equation and set a v = 0.03 in 
(ED- Another choice of perturbation as a v = 0.01 was considered before in [5]. Comparing 
Figurs [TUI and fTTl we see that evolutions of higher order cumulants are perturbed in most 
cases as expected. In this case, it took a longer transient time to converge to an invariant 
measure as additional nonlinearity was introduced into the system. 


5 Conclusions 

Polynomial chaos expansions (PCE) applied to the simulation of evolution equations suffer 
from two drawbacks: in the presence of stochastic forcing, the dimension of randomness is 
too large, and the long-time solution may not be sparsely represented in a fixed PCE basis 
urn- In the setting of Markovian random forcing, we have proposed a restart method 
that addresses the two aforementioned drawbacks. Such restarts of the PCE, which we 
called here Dynamical generalized Polynomial Chaos (DgPC), allow us to both to keep the 
number of random variables small and to obtain a solution that remains reasonably sparse 
in the evolving basis of orthogonal polynomials, as was done earlier in [El- 

Following |5j, we applied DgPC to the long-time numerical simulation of various stochas¬ 
tic differential equations (SDEs). To demonstrate the ability of the algorithm to reach long¬ 
time solutions, we computed invariant measures for SDEs that admit one and found a very 
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(d) 5 th order 


(e) cumulants of v 


(f) kurtosis excess 


Figure 11: Bivariate cumulants ( a v = 0.03). 


good agreement between DgPC and other standard methods such as Monte Carlo-based 
simulations. We also presented a simple theoretical justification for such a convergence. 

The main computational difficulty in DgPC, as in most gPC-based methods, is the 
estimation of the orthogonal polynomials of evolving arbitrary measures. Our method, 
as in m, is based on estimating moments of the multivariate distribution of interest 
and estimating orthogonal polynomials by a Gram-Schmidt procedure. The bottleneck in 
such estimations is the cost of computing moments. This is similar to the cost of solving 
differential equations by PCE methods, where moment estimations are also the most costly 
step. However, in DgPC, such estimations cannot be performed offline. 

From a theoretical point of view, we need to ensure that the evolving measures remain 
determinate so that their orthogonal polynomials span square integrable functionals. Since 
distributions with compact support are determinate, our theoretical results have been 
applied in the setting where SDE solutions are well approximated by compactly supported 
distributions. In this connection, we note that the calculation of orthogonal polynomials 
from knowledge of moments is an ill-posed problem, which is a serious impediment to gPC 
methods in general. 

Extension of the method to larger systems than those considered here, including to sys¬ 
tems obtained by solving stochastic partial differential equations (SPDEs) is challenging. 
The method works reasonably well for low dimensional SDEs and it needs modifications 
for larger systems. However, the general idea remains applicable. To keep the number of 
uncertain variables in check, the method has to be coupled with a compression technique 
such as Karhunen-Loeve projection at each restart time. Moreover, sparse truncation tech¬ 
niques further help to reduce costs. The analysis of extensions of DgPC to larger systems, 
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faster methods to compute orthogonal polynomials, and adaptive restart procedures is the 
subject of ongoing research. 
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